--- --- Splines and Generalized Additive Models

Goals for Today

  1. Walk away understanding how splines are constructed and fit.
  2. Start to understand some of the strengths and weaknesses of non-linear modelling.
  3. Understand the basics of fitting a generalized additive model in R.

Where would this sit in a full course?

Before today

We couldn’t really jump into this content on day one of a course. Since fitting, analyzing, and visualizing generalized additive models is all done in R, the preceding weeks would be dedicated to teaching the basics of R programming, data wrangling and visualization, and I would probably utilize some revised version of these materials from my 2017 LSA Institute course for that:

Similarly, we’d also want to lay some foundation in statistical analysis and regression modelling. For that, we’d either continue on with my Institute materials, or Danielle Navarro’s much praised Learning Statistics with R.

After today

After understanding the basics how how splines work, and how we fit them using generalized additive models, topics for future meetings would be about

  • how to interpret the results of the model (more complex models = harder to interpret)
  • how to visualize the model fits
  • how to complexify our models in important ways (random effects, autocorrelation)
  • how to get interesting derivative information from the model (like, rate of change, acceleration of change)

Why Use Splines?

Straight Lines Are Simple

The most common kind of modelling technique we use when we have data to analyze is a linear regression. For example, here’s two linear regressions that are modelling the relationship between date of birth and normalized F1 (aka vowel height) for two allophones of /ay/ in Philadelphia.

This approach is useful, because first off, these straight lines don’t look to correct, and they boil the complex social and mathematical relationships involved in the actual data down to just two numbers:

  1. The intercept of the line (the \(y\) value when \(x=0\)).
  2. The slope (the degree to which \(y\) changes for every increase of 1 for \(x\)).

Two numbers are easy to look at, and easy to understand. It’s also the case that if the data was just a little bit different, these numbers wouldn’t change very much.1

The linear modelling approach is also ubiquitous. Every fancy or state of the art elaboration on linear modelling is, at its core, still trying to describe the relationship between variables in terms of intercepts and slopes, including:

  • logistic regression
  • poisson regression
  • zero inflated poisson regression
  • mixed effects models

More or less, my advice would if you can get away with using a linear regression or some fancier version of it, go for it.

Straight lines are too simple sometimes

Sometimes the kind of data we’re working with is obviously more complex than could reasonably be modeled using straight lines. Here’s a plot of the formant trajectory of one token of [ay] moving through the F1xF2 space, and below that is just the F1 trajectory of [ay] over time.

We could try to model these trajectories using straight lines. But if we do this on a subset of /ay/ formant track data from a subset of people in the PNC, the linear regression lines in blue are losing some important information that we can see is present in the data.

What we want is some way of characterizing an arbitrary curvy line like the single F1 formant track below, using just a few numbers, in a way that is (relatively) easy to interpret and extrapolate from.

One way you might have seen of doing this is utilizing “polynomial” regression.

model <- lm(F1 ~ t_prop + I(t_prop^2) + I(t_prop^3), data = ay_single)

This gets closer to fitting the data, but the big problem here is understanding what on earth \(t^3\) means!

“Basis” functions

One way to approximate an arbitrary curve with a small number of coefficients is to use a “basis” function. The basis function is a collection of curves that you can scale, then sum over, to approximate the curve of interest. If you’ve ever taken intro to phonetics, you’re actually already familiar with one basis function, even if you’re not aware of it!

The Fourier Transform!

What is the curve of greatest interest to linguists if not a waveform?

This waveform is from a very short clip (16 milliseconds) of me saying [i]. Just this very short clip has 742 data points (or samples) in it. But we can approximate the waveform with a much smaller number of values using the Fourier transform.

This is an example of a very small Fourier basis. It’s a collection of 4 sine and cosine curves across the interval of interest. One thing you can see is that each next curve is oscillating at a higher frequency. One of them only has one peak in the interval, while the next has two, then three, etc.

The way you approximate an arbitrary curve with this collection of curves is by scaling each one. That is, you make it bigger or smaller by multiplying it by some number. Then, you scan across the x axis, summing up the values to get the estimate for the curve.

Now, 4 curves are not enough to capture the complexity of the waveform above. With a larger number of curves, each oscillating at a faster and faster rate, we can get closer. For example, the approximation we get with a Fourier basis of 15 curves is pretty close, but still over-smoothing it.

Here’s an animation of how much closer and closer our model can approximate the waveform as we increase the number of curves in the basis.

This illustrates an important point that we need to take seriously when we’re fitting our own models for real:

Bigger Basis = More Wiggly

The next kinda fun thing to look at is the actual weights that we’re using to scale each curve in the Fourier Basis. Here’s a little plot of all of the coefficients, color coded black for the biggest values and white for smallest values, stacked on top of eachother in a single column.

If we did a bit more math, and then did this same curve fitting process in a rolling window across an entire audio file, and at each next window placed the column of coefficients for each window next to the one before, the graph would start to look like something pretty familiar.

If we come back to our original question about how to approximate a curve matching the F1 trajectory of [ay] above, we can actually also use a Fourier basis! The Fourier transform can be used to described any curve.

However, the Fourier basis is particularly good for waveforms becuase we know that they consist of oscillating curves, and the fourier basis is made up of oscillating curves. This means we can actually look at the coefficients from the Fourier transform and they’ll have some kind of meaning to us. This isn’t true for most curves we’re interested in, though, like the F1 formant trajectory. In their 2006 book on Functional Data Analysis, Ramsay & Silverman say

A Fourier series is like margarine: It’s cheap and you can spread it on practically anything, but don’t expect that the result will be exciting eating.

Splines, a.k.a. Rosemary Truffle Aioli

There are lots of different kinds of splines, but they all basically look like some variant of this:

Just like we saw before, we can re-scale each of these curves then sum over them to get some new estimated curve.

Here’s a little animation showing a range of different curves we can get out of a b-spline basis this size:

And it’s also important to also remember:

Bigger Basis = More Wiggly

Knots

Now is the time to introduce a new notion: knots. You’ll most often see people talking about splines in terms of “knots” or “control points”. If you’ve everplayed around with making vector graphics in something like Photoshop, Illustrator, or Inkscape, you’ll be kind of familiar with with the notion of “knots”. Here’s a figure of a line I drew in Inkscape that has one knot.

The relationship between the number of “knots” and the number of basis functions is a little bit complicated2, but usually the more knots, the more basis polynomials there are, so we can now further revise our saying about wiggliness.

More Knots = More Wiggly

The basis we were looking at above actually had only one knot, but even with that we can fit the curve of the that single formant track really well.

Interpretation… not so straightforward

The downside to splines is that the coefficients we get for each polynomial aren’t really interpretable. Here are the 5 coefficients that are being used to fit the curve to the formant trajectory.

      X1       X2       X3       X4       X5 
743.0011 817.7983 820.2886 585.1586 593.9880 

These don’t really mean anything like the fourier transform coefficients did. Interpreting the results of fitting a model with splines will take a little bit more work. For the most part, it will involve making visualizations of the data.ah

The Math Basics

Here are two questions that are pretty reasionable right now:

  1. Ok, how do i get these curves?
  2. Isn’t multiplying and adding all these curves together a lot of work?

The answers are “the computer gives them to us” and “surprisingly no!”

For the most part, we’re not going to be constructing spline bases “by hand” (see the “Doing it in R” section below). But even then it’s not that tricky. The splines package will let us construct a b-spline basis as easilly as this:

library(splines)
bspline_basis <- bs(1:10, df = 5, intercept = T)
bspline_basis

Now lets say we choose the following 5 coefficients

coefficients <- c(3, 6, 2, 5, 1)

Scaling each polynomial and summing up across them is as easy as:

coefficients %*% t(bspline_basis)
     [,1]     [,2]    [,3]     [,4]     [,5]     [,6]     [,7]     [,8]     [,9] [,10]
[1,]    3 4.333333 4.62963 4.333333 3.888889 3.721536 3.814815 3.710562 2.931413     1

The %*% syntax is doing matrix multiplication. Usually you can get away without thinking about matrix multiplicatio or linear algebra when doing stats, but since it underlies so many quantitative and computational methods, it’s handy to know a little bit about it.

This animation illustrates the mathematical operations that are going on when you do the matrix multiplication written out above.

Doing it in R

To fit a spline to data in R, you’ll want to use the gam() function, which stands for “generalized additive model” from the mgcv package. Let’s remind ourselves of the [ay] trajectory data.

Let’s fit a generalized additive model to this data, with the following specifications.

  • We want to enter gender into the model like we would for any linear model.
  • We want to fit a spline over t_prop with, let’s say, 10 knots.
  • We want to fit separate splines for each gender.

Here’s the code:

library(mgcv)
ay_model <- gam(F1 ~ gender + s(t_prop, by = gender, k = 10), 
                data = ay_trajectory)

Now, gam is doing a lot more fancy stuff than just finding the best coefficients for each basis polynomial. For one, it’s automatically identifying and utilizing a “smoothing parameter.” As the smoothing parameter increases, it’ll penalize the curve down to a straight line. What’s nice about this is that the resulting smooth won’t (usually) be unduely wiggly, even if you have many many knots.

If we look at a summary of ay_model, there’s some useful information, but it’s less interpretable than a normal linear regression.

summary(ay_model)

Family: gaussian 
Link function: identity 

Formula:
F1 ~ gender + s(t_prop, by = gender, k = 10)

Parametric coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)  519.706      3.802  136.68   <2e-16 ***
genderwomen  127.337      5.510   23.11   <2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Approximate significance of smooth terms:
                        edf Ref.df     F p-value    
s(t_prop):gendermen   2.899  3.596 35.03  <2e-16 ***
s(t_prop):genderwomen 3.374  4.172 48.22  <2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

R-sq.(adj) =  0.733   Deviance explained = 73.9%
GCV = 2449.7  Scale est. = 2385.4    n = 315

The significant gender effect tells us that, unsurprisingly, women had a higher F1 than men by about 127 Hz. The significance of the smooth terms isn’t all that meaningful: it means the smooths were significantly different than a straight flat line. The column called edf (stands for Estimated Degrees of Freedom) is useful to eyeball. If the smooth was penalized down to a straight line, it would have an an EDF of about 1. These EDFs are larger than 1, and women’s are larger than men’s (although, we’d need to do a lot more work to see if it was significantly larger), so it looks like it was worth fitting a fanicer model than just a linear model!

To really explore gam fits, we have to plot them. I’m going to spare you the code for making these plots right now, but you can scrutinize it by downloading the code from the little drop down button at the top of the page.

But here’s an important warning!

Don’t just look to see if the confidence intervals of the smooths are overlapping or not!

You can’t really really tell if two curves are significantly different from just looking at whether their confidence intervals overlap, cause that’s not really how confidence intervals work. Instead, we need to look at a difference curve, which we can get from a gam model.

Recap, and Further Issues

Recap

  1. Splines are fit using a “basis” of polynomial curves which are scaled and summed to approximate a curve.
  2. One of the strengths of using non-linear methods is that you can model relationships that are clearly non linear. Some weaknesses is that it is more challenging to intepret the results, and we need to be careful of over-fitting.
  3. You can fit a spline using generalized additive models, specifically as implemented in the mgcv package.

Things you’ll want gams to do, and they can

Random Intercepts

Each speaker has their own personal average F1 value for their [ay] that won’t be perfectly captured by their demographic features. For this, you’ll want to fit random intercepts by speaker, and you can.

Random smooths

Each speaker will also have their own personal F1 trajectories, and to capture this, you’ll want to fit random smooths (like random slopes), and you can.

Autocorrelation

Importantly, the data we’re modelling aren’t just a random association of data points between time stamps and F1 values, like this plot shows.

Each F1 value is directly connected to an F1 value at the next time point. That is, they are auto-correlated.

You can capture this with gams also!

Multidimensional Smooths

The most fun thing to do with gams is to fit multidimensional smooths. For example, how does the trajectory of F1 change over dates of birth? I’ve included the code for that model here, just so you can see how simple an extension it is of what we’ve already done.

big_ay_model <- gam(F1 ~ te(t_prop, dob, k = 10)+
                       s(idstring, bs = 're'), 
                     data = ay0_m)


  1. It has low variance, but high bias.

  2. The number of basis functions is = knots + degree - 2. The “degree” of the basis functions controls how the curves connect smoothly at the knots. A degree of 3 makes the rate of change smooth, and a degree of 4 also makes the acceleration smooth. A degree of 3 or 4 is usually what you’ll want to be working with.

---
title: "Splines and Generalized Additive Models"
output: 
  html_notebook: 
    code_folding: none
    css: custom.css
    theme: flatly
    toc: yes
    toc_float: yes
    toc_depth: 3
---

```{r echo = F}
library(tidyverse)
library(splines)
library(tuneR)
library(ggthemes)
library(gganimate)
library(fda)
```


# Goals for Today

1. Walk away understanding how splines are constructed and fit.
2. Start to understand some of the strengths and weaknesses of non-linear modelling.
3. Understand the basics of fitting a generalized additive model in R.

## Where would this sit in a full course?

### Before today

We couldn't really jump into this content on day one of a course. Since fitting, analyzing, and visualizing generalized additive models is all done in R, the preceding weeks would be dedicated to teaching the basics of R programming, data wrangling and visualization, and I would probably utilize some revised version of these materials from my 2017 LSA Institute course for that:

<div style="width:100%;float:left;">
<div class = "box sheet" style = "width:20%;float:left;margins:auto;">

[![](figures/week1.png)](https://jofrhwld.github.io/teaching/courses/2017_lsa/lectures/Session_1.nb.html)

</div>

<div class = "box sheet" style = "width:20%;float:left;margins:auto;">

[![](figures/week2.png)](https://jofrhwld.github.io/teaching/courses/2017_lsa/lectures/Session_2.nb.html)

</div>
<div class = "box sheet" style = "width:20%;float:left;margins:auto;">

[![](figures/week3.png)](https://jofrhwld.github.io/teaching/courses/2017_lsa/lectures/Session_3.nb.html)

</div>
<div class = "box sheet" style = "width:20%;float:left;margins:auto;">

[![](figures/week4.png)](https://jofrhwld.github.io/teaching/courses/2017_lsa/lectures/Session_4.nb.html)

</div>
</div>

Similarly, we'd also want to lay *some* foundation in statistical analysis and regression modelling. For that, we'd either continue on with my Institute materials, or Danielle Navarro's much praised *Learning Statistics with R*.

<div style="width:100%;float:left;">
<div class = "box sheet" style = "width:20%;float:left;margins:auto;">

[![](figures/week5.png)](https://jofrhwld.github.io/teaching/courses/2017_lsa/lectures/Session_5.nb.html)

</div>

<div class = "box sheet" style = "width:20%;float:left;margins:auto;">

[![](figures/week6.png)](https://jofrhwld.github.io/teaching/courses/2017_lsa/lectures/Session_6.nb.html)

</div>
<div class = "box sheet" style = "width:20%;float:left;margins:auto;">

[![](figures/week7.png)](https://jofrhwld.github.io/teaching/courses/2017_lsa/lectures/Session_7.nb.html)

</div>
<div class = "box sheet" style = "width:20%;float:left;margins:auto;">

[![](figures/week8.png)](https://jofrhwld.github.io/teaching/courses/2017_lsa/lectures/Session_8.nb.html)

</div>
</div>


<div style="width:100%;float:left;">
<div class = "box sheet" style = "width:20%;float:left;mmargin-right:auto;margin-left:auto;">

[![](figures/lswr.png)](https://learningstatisticswithr.com)

</div>
</div>

### After today

After understanding the basics how how splines work, and how we fit them using generalized additive models, topics for future meetings would be about

- how to interpret the results of the model (more complex models = harder to interpret)
- how to visualize the model fits
- how to complexify our models in important ways (random effects, autocorrelation)
- how to get interesting derivative information from the model (like, rate of change, acceleration of change)


# Why Use Splines?

## Straight Lines Are Simple

The most common kind of modelling technique we use when we have data to analyze is a *linear regression*.
For example, here's two linear regressions that are modelling the relationship between date of birth and normalized F1 (aka vowel height) for two allophones of /ay/ in Philadelphia.

<div style = "width:75%;margin-right:auto;margin-left:auto;">
```{r fig.width=2.5, fig.height = 1.5, echo = F, out.width = "100%"}
all_meas_df <- read_rds("data/all_meas_df.rds")
ays <- all_meas_df %>% filter(plt_vclass %in% c("ay0", "ay"))
ays %>%
  mutate(dob = year-age) %>%
  group_by(idstring, plt_vclass, dob) %>%
  summarise(F1_n = mean(F1_n))->ay_means
ay_means %>%
ggplot(aes(dob, F1_n, color = plt_vclass))+
  geom_point(alpha = 0.4)+
  scale_y_reverse()+
  stat_smooth(method = "lm")+
  scale_color_colorblind("vowel", limits = c("ay0", "ay"), labels = c("ayT", "ayV"))+
  ylab("normalized F1")+
  xlab("date of birth")+
  theme_minimal()
    
```
</div>

This approach is useful, because first off, these straight lines don't look **to** correct, and they boil the complex social and mathematical relationships involved in the actual data down to just two numbers:

1. The intercept of the line (the $y$ value when $x=0$).
2. The slope (the degree to which $y$ changes for every increase of 1 for $x$).

Two numbers are easy to look at, and easy to understand. It's also the case that if the data was just a little bit different, these numbers wouldn't change very much.[^1]


The linear modelling approach is also ubiquitous. Every fancy or state of the art elaboration on linear modelling is, at its core, still trying to describe the relationship between variables in terms of intercepts and slopes, including:

- logistic regression
- poisson regression
- zero inflated poisson regression
- mixed effects models

More or less, my advice would if you can get away with using a linear regression or some fancier version of it, go for it.

## Straight lines are too simple sometimes

Sometimes the kind of data we're working with is obviously more complex than could reasonably be modeled using straight lines. Here's a plot of the formant trajectory of one token of [ay] moving through the F1xF2 space, and below that is just the F1 trajectory of [ay] over time.

<div style = "width:75%;margin-right:auto;margin-left:auto;">
```{r fig.width=2.5, fig.height = 1.5, echo = F, out.width = "100%"}
my_iua <- read.csv("audio/my_iua.csv")
my_ay <- read.csv("audio/my_ay.Table")
ay_f1 <- loess(F1 ~ time, data = my_ay)
ay_f2 <- loess(F2 ~ time, data = my_ay)

pred <- my_ay
pred$F1 <- fitted(ay_f1)
pred$F2 <- fitted(ay_f2)
pred %>%
  mutate(F1_t = lag(F1),
         F2_t = lag(F2),
         F1_d = log(F1)-log(F1_t),
         F2_d = log(F2)-log(F2_t),
         vel = sqrt((F1_d^2) + (F2_d^2)),
         time2 = time-median(time))%>%
  filter(is.finite(vel))->pred


ggplot(my_iua, aes(F2, F1))+
    geom_text(aes(label = vclass))+
    geom_path(data = pred, aes(color = time2))+
    geom_point(data = pred, aes(size=1/vel, color = time2))+
    scale_y_reverse()+
    scale_x_reverse()+
    scale_size(guide = "none")+
    scale_color_gradient2("centered time",
                               low = "#4477AA", 
                               mid =  "#DDCC77",
                               high = "#CC6677")+
    theme_minimal()
```

```{r fig.width=2.5, fig.height = 1.5, echo = F, out.width = "100%"}
ggplot(pred, aes(time2, F1))+
    geom_path(data = pred, aes(color = time2))+
    geom_point(data = pred, aes(size=1/vel, color = time2))+
    scale_size(guide = "none")+
    scale_color_gradient2("centered time",
                               low = "#4477AA", 
                               mid =  "#DDCC77",
                               high = "#CC6677")+
    xlab("centered time")+
    theme_minimal()
```


</div>

We *could* try to model these trajectories using straight lines. But if we do this on a subset of /ay/ formant track data from a subset of people in the PNC, the linear regression lines in blue are losing some important information that we can see is present in the data.

```{r echo = F}
ay0_df <- readRDS("data/ay0_df.rds")
ay0_df %>%
  ungroup() %>%
  mutate(gender = case_when(sex == "f" ~ "women",
                            TRUE ~ "men"),
         idstring = factor(idstring)) %>%
  group_by(idstring, id) %>%
  mutate(t_min = t - min(t),
         t_prop = t_min/max(t_min),
         n = 1:n())->ay0_df
```



<div style = "width:85%;margin-right:auto;margin-left:auto;">
```{r fig.width=2.5, fig.height = 1.5, echo = F, out.width = "100%"}
ay0_df %>%
  filter(dob %in% c(1964, 1965))%>%
  group_by(idstring, dob, gender, n)%>%
  summarise(F1 = mean(F1),
            t_prop = mean(t_prop)) -> ay_trajectory

ay_trajectory %>%
  ggplot(aes(t_prop, F1))+
    geom_line(aes(group = idstring), color = "grey50")+
    stat_smooth(method = "lm")+
    facet_wrap(~gender) + 
    theme_bw()+
    ggtitle("dob = 1964, 1965, linear model")
```
</div>

What we want is some way of characterizing an arbitrary curvy line like the single F1 formant track below, using just a few numbers, in a way that is (relatively) easy to interpret and extrapolate from.

<div style = "width:85%;margin-right:auto;margin-left:auto;">

```{r fig.width=2.5, fig.height = 1.5, echo = F, out.width = "100%"}
ay_trajectory %>%
  group_by(idstring)%>%
  mutate(first = F1[1]) %>%
  filter(first > 710) ->ay_single
ay_single %>%
  ggplot(aes(t_prop, F1))+
    geom_point()+
    geom_line()+
    theme_minimal()
```
</div>

One way you might have seen of doing this is utilizing "polynomial" regression.

```{r}
model <- lm(F1 ~ t_prop + I(t_prop^2) + I(t_prop^3), data = ay_single)
```

<div style = "width:85%;margin-right:auto;margin-left:auto;">
```{r fig.width=2.5, fig.height = 1.5, echo = F, out.width = "100%"}
ay_single %>%
  mutate(pred = predict(model)) %>%
  ggplot(aes(t_prop, F1))+
    geom_point()+
    geom_line()+
    geom_line(aes(y = pred), color = "red")+
    theme_minimal()
```
</div>

This gets closer to fitting the data, but the big problem here is understanding what on earth $t^3$ means!


# "Basis" functions

One way to approximate an arbitrary curve with a small number of coefficients is to use a "basis" function.
The basis function is a collection of curves that you can scale, then sum over, to approximate the curve of interest.
If you've ever taken intro to phonetics, you're actually already familiar with one basis function, even if you're not aware of it!


## The Fourier Transform!

What is the curve of greatest interest to linguists if not a waveform? 

```{r echo = F}
wav <- readWave("audio/shotwav.wav")
```

<div style = "width:85%;margin-right:auto;margin-left:auto;">
```{r echo = F}
tibble(pressure = c(0,wav@left,0)) %>%
  mutate(samp = 1:n()) ->wav_df
wav_df %>%
  ggplot(aes(samp, pressure))+
    geom_line()+
    theme_void()
```
</div>

This waveform is from a very short clip (16 milliseconds) of me saying [i].
Just this very short clip has 742 data points (or samples) in it.
But we can approximate the waveform with a much smaller number of values using the Fourier transform.

<div style = "width:85%;margin-right:auto;margin-left:auto;">
```{r echo = F}
fourier <- create.fourier.basis(rangeval = c(1,max(wav_df$samp)),
                     nbasis = 5)
f_basis <- eval.basis(1:742, fourier)[,-1]
as.data.frame(f_basis) %>%
  mutate(samp = 1:n()) ->fourier_df
fourier_df %>%
  gather(curve, value, -samp) %>%
  mutate(n = 1:n())->long

long %>%
  mutate(curven = as.numeric(factor(curve, levels = colnames(f_basis)))) ->long

long %>%
  ggplot(aes(samp, value, color = curve))+
    geom_line(aes(group = curve), size = 2)+
    scale_color_colorblind(guide = "none")+
    theme_void()+
    ggtitle("A Fourier Basis!")
```

</div>

This is an example of a very small Fourier basis.
It's a collection of 4 sine and cosine curves across the interval of interest.
One thing you can see is that each next curve is oscillating at a higher frequency.
One of them only has one peak in the interval, while the next has two, then three, etc.

The way you approximate an arbitrary curve with this collection of curves is by scaling each one.
That is, you make it bigger or smaller by multiplying it by some number.
Then, you scan across the x axis, summing up the values to get the estimate for the curve.


```{r echo = F}
curve_order <- colnames(fourier_df)
curve0 <- long %>% mutate(step = 1)
curve1 <- long %>%
            mutate(value = case_when(curve == curve_order[1] ~ value*0.5,
                                     TRUE ~ value),
                   step = 2)
curve2 <- curve1 %>%
            mutate(value = case_when(curve == curve_order[2]  ~ value*0.75,
                                     TRUE ~ value),
                   step = 3)
curve3 <- curve2 %>%
            mutate(value = case_when(curve == curve_order[3] ~ value*1.2,
                                     TRUE ~ value),
                   step = 4)
curve4 <- curve3 %>%
            mutate(value = case_when(curve == curve_order[4] ~ value*0.1,
                                     TRUE ~ value),
                   step = 5)
curve5 <- curve4 %>%
            mutate(value = case_when(curve == curve_order[5] ~ value*0.2,
                                     TRUE ~ value),
                   step = 6)
bind_rows(curve0, curve1, curve2, curve3,curve4, curve5) %>%
  ggplot(aes(samp, value, color = curve))+
    geom_line(aes(group = curve), size = 2)+
    scale_color_colorblind(guide = "none")+
    theme_void()+
    ggtitle("A Fourier Basis!")+
    transition_states(step,
                    transition_length = 2,
                    state_length = 1)->anim1
anim_save(filename = "figures/weight_fourier.gif", animation = anim1)
  
```

<div style = "width:85%;margin-right:auto;margin-left:auto;">
![](figures/weight_fourier.gif)
</div>


<div style = "width:85%;margin-right:auto;margin-left:auto;">
```{r echo = F}
curve5 %>%
  group_by(samp) %>%
  summarise(value = sum(value))->final_curve

curve5 %>%
ggplot(aes(samp, value, color = curve))+
    geom_line(aes(group = curve), size = 2)+
    geom_line(data = final_curve, color = "red", linetype = 2, size = 2)+
    scale_color_colorblind(guide = "none")+
    theme_void()+
    ggtitle("A Fourier Basis!")

```
</div>

Now, 4 curves are not enough to capture the complexity of the waveform above.
With a larger number of curves, each oscillating at a faster and faster rate, we can get closer.
For example, the approximation we get with a Fourier basis of 15 curves is pretty close, but still over-smoothing it.

```{r echo = F}
fourier <- create.fourier.basis(rangeval = c(1,max(wav_df$samp)),
                     nbasis = 15)
f_basis <- eval.basis(1:742, fourier)
as.data.frame(f_basis) %>%
  mutate(samp = 1:n()) ->fourier_df


wav_df %>%
  left_join(fourier_df) %>%
  select(-samp, -const) %>%
  lm(pressure ~ ., data=.) -> small_model
```

<div style = "width:85%;margin-right:auto;margin-left:auto;">
```{r echo = F}
wav_df %>%
  mutate(pred = predict(small_model))%>%
  ggplot(aes(samp, pressure)) +
    geom_line()+
    geom_line(aes(y = pred), color = "red")+
    ggtitle("nbasis = 15")+
    theme_void()
```
</div>

Here's an animation of how much closer and closer our model can approximate the waveform as we increase the number of curves in the basis.

```{r echo = F}
make_fit2 <- function(ay_single, nbasis) {
  fourier_ay <- create.fourier.basis(rangeval = c(1,nrow(ay_single)),
                       nbasis = nbasis)
  f_basis_ay <- eval.basis(1:nrow(ay_single), fourier_ay)
  as.data.frame(f_basis_ay) %>%
    mutate(samp = 1:n()) ->fourier_df_ay
  
  ay_single %>%
    ungroup() %>%
    select(pressure) %>%
    mutate(samp = 1:n()) %>%
    left_join(fourier_df_ay) %>%
    select(-samp, -const) %>%
    lm(pressure ~ ., data=.) -> ay_model
  
  ay_single %>%
    mutate(pred = predict(ay_model),
           nbasis = nbasis) ->out_df 
  out_df %>%
    ungroup() %>%
    select(samp, pressure, pred)%>%
    gather(source, value, c("pressure", "pred"))->out_df
  return(out_df)
}

odd_series2 <- seq(5,101, by = 2)

all_wav<- map_df(odd_series2, ~make_fit2(wav_df, .x), .id = "nbasis")

all_wav %>%
  mutate(nbasis = odd_series2[as.numeric(nbasis)])->all_wav

all_wav  %>%
    mutate(source = factor(source, levels = c("pressure", "pred"))) %>%
    ggplot(aes(samp, value, color = source))+
    geom_line()+
    scale_color_manual(values = c("black", "red"))+
    theme_void()+
    transition_states(nbasis)+
    ease_aes('cubic-in-out')+
    ggtitle("Fit using nbasis = {closest_state}")->anim2
anim_save(filename = "figures/fourier_fit.gif", anim2, 
          fps = 10, nframes = 300)
```

<div style = "width:85%;margin-right:auto;margin-left:auto;">
![](figures/fourier_fit.gif)
</div>




```{r echo = F}
fourier_big <- create.fourier.basis(rangeval = c(1,max(wav_df$samp)),
                     nbasis = 101)
f_basis_big <- eval.basis(1:742, fourier_big)
as.data.frame(f_basis_big) %>%
  mutate(samp = 1:n()) ->fourier_df_big
wav_df %>%
  left_join(fourier_df_big) %>%
  select(-samp, -const) %>%
  lm(pressure ~ ., data=.) -> silly_model
```




<div style = "width:85%;margin-right:auto;margin-left:auto;">
```{r echo = F}
wav_df %>%
  mutate(pred = predict(silly_model))%>%
  ggplot(aes(samp, pressure)) +
    geom_line()+
    geom_line(aes(y = pred), color = "red")+
    ggtitle("nbasis = 101")+
    theme_void()
```
</div>

This illustrates an important point that we need to take seriously when we're fitting our own models for real:

<div class = "emphasize">Bigger Basis = More Wiggly</div>



The next kinda fun thing to look at is the actual *weights* that we're using to scale each curve in the Fourier Basis. Here's a little plot of all of the coefficients, color coded black for the biggest values and white for smallest values, stacked on top of eachother in a single column.

```{r, fig.width = 0.2, fig.height=1, echo = F}
tibble(coef = coef(silly_model)) %>%
  mutate(n = 1:n(),
         x = 1)  %>%
  ggplot(aes(x, n))+
    geom_tile(aes(fill = log(abs(coef))))+
    theme_void()+
    scale_fill_gradient(low = "white", high = "black", guide = "none")
  
```


If we did a bit more math, and then did this same curve fitting process in a rolling window across an entire audio file, and at each next window placed the column of coefficients for each window next to the one before, the graph would start to look like something pretty familiar.

<div style = "width:85%;margin-right:auto;margin-left:auto;">
![](figures/i_spect.png)
</div>


If we come back to our original question about how to approximate a curve matching the F1 trajectory of [ay] above, we can actually also use a Fourier basis!
The Fourier transform can be used to described any curve.

```{r echo = F}

odd_series <- seq(1,17, by = 2)

make_fit <- function(ay_single, nbasis) {
  fourier_ay <- create.fourier.basis(rangeval = c(1,nrow(ay_single)),
                       nbasis = nbasis)
  f_basis_ay <- eval.basis(1:nrow(ay_single), fourier_ay)
  as.data.frame(f_basis_ay) %>%
    mutate(samp = 1:n()) ->fourier_df_ay
  
  ay_single %>%
    ungroup() %>%
    select(F1) %>%
    mutate(samp = 1:n()) %>%
    left_join(fourier_df_ay) %>%
    select(-samp, -const) %>%
    lm(F1 ~ ., data=.) -> ay_model
  
  ay_single %>%
    mutate(pred = predict(ay_model),
           nbasis = nbasis) ->out_df 
  out_df %>%
    ungroup() %>%
    select(t_prop, F1, pred)%>%
    gather(source, value, c("F1", "pred"))->out_df
  return(out_df)
}


all_fits <- map_df(odd_series, ~make_fit(ay_single, .x), .id = "nbasis")
all_fits %>%
  mutate(nbasis = odd_series[as.numeric(nbasis)])->all_fits

all_fits %>%
  ggplot(aes(t_prop, value, color = source, size=source, alpha = source))+
    geom_line()+
    scale_color_manual(values = c("black", "red"))+
    scale_size_manual(values = c(1,3))+
    scale_alpha_manual(values = c(1, 0.4))+
    theme_minimal()+
    transition_states(nbasis)+
    ease_aes('cubic-in-out')+
    ggtitle("Fit using nbasis = {closest_state}")->formant_anim

#anim_save(filename = "figures/fourier_formant.gif", formant_anim, 
#          fps = 10, nframes = 100)
```

<div style = "width:85%;margin-right:auto;margin-left:auto;">
![](figures/fourier_formant.gif)
</div>

**However**, the Fourier basis is particularly good for waveforms becuase we know that they consist of oscillating curves, and the fourier basis is made up of oscillating curves. 
This means we can actually look at the coefficients from the Fourier transform and they'll have some kind of meaning to us.
This isn't true for most curves we're interested in, though, like the F1 formant trajectory.
In their 2006 book on Functional Data Analysis, Ramsay & Silverman say

> A Fourier series is like margarine: It’s cheap and you can spread it on practically anything, but don’t expect that the result will be exciting eating.

## Splines, a.k.a. Rosemary Truffle Aioli

There are lots of different kinds of splines, but they all basically look like some variant of this:

<div style = "width:85%;margin-right:auto;margin-left:auto;">
```{r fig.width=2.5, fig.height = 1.5, echo = F, out.width = "100%"}
data.frame(bs(x = seq(0,1,length = 100), df = 5, intercept = T)) %>%
  mutate(time = seq(0,1,length = 100))%>%
  gather(poly,
         value,
         -time)->bs_basis_df

bs_basis_df %>%
  ggplot(aes(time, value, color = poly))+
    geom_line(size = 1)+
    scale_color_colorblind(guide = "none")+
    ggtitle("A b-spline Basis!")+
    theme_minimal()
```
</div>

Just like we saw before, we can re-scale each of these curves then sum over them to get some new estimated curve.

```{r echo=F}
b_spline_coefs <- runif(5, -1, 1)
```

```{r echo = F}
curve_order <- unique(bs_basis_df$poly)
curve0 <- bs_basis_df %>% mutate(step = 1)
curve1 <- bs_basis_df %>%
            mutate(value = case_when(poly == curve_order[1] ~ value * b_spline_coefs[1],
                                     TRUE ~ value),
                   step = 2)
curve2 <- curve1 %>%
            mutate(value = case_when(poly == curve_order[2]  ~ value* b_spline_coefs[2],
                                     TRUE ~ value),
                   step = 3)
curve3 <- curve2 %>%
            mutate(value = case_when(poly == curve_order[3] ~ value* b_spline_coefs[3],
                                     TRUE ~ value),
                   step = 4)
curve4 <- curve3 %>%
            mutate(value = case_when(poly == curve_order[4] ~ value* b_spline_coefs[4],
                                     TRUE ~ value),
                   step = 5)
curve5 <- curve4 %>%
            mutate(value = case_when(poly == curve_order[5] ~ value* b_spline_coefs[5],
                                     TRUE ~ value),
                   step = 6)
bind_rows(curve0, curve1, curve2, curve3,curve4, curve5) %>%
  ggplot(aes(time, value, color = poly))+
    geom_line(aes(group = poly), size = 2)+
    scale_color_colorblind(guide = "none")+
    theme_minimal()+
    ggtitle("A b-spline Basis!")+
    transition_states(step,
                    transition_length = 2,
                    state_length = 1)+
    ease_aes("cubic-in-out")->anim_bs
#anim_save(filename = "figures/weight_bs.gif", 
#          animation = anim_bs)
  
```

<div style = "width:85%;margin-right:auto;margin-left:auto;">
![](figures/weight_bs.gif)
</div>


<div style = "width:85%;margin-right:auto;margin-left:auto;">

```{r fig.width=2.5, fig.height = 1.5, echo = F, out.width = "100%"}
curve5 %>%
  group_by(time)%>%
  summarise(value = sum(value))->bs_curve

curve5 %>%
  ggplot(aes(time, value, color = poly))+
    geom_line(aes(group = poly), size = 1)+
    geom_line(data = bs_curve, color = "red", linetype = 2,
              size = 1)+
    scale_color_colorblind(guide = "none")+
    theme_minimal()+
    expand_limits(y = 1)+
    ggtitle("A b-spline Basis!")
```
</div>


Here's a little animation showing a range of different curves we can get out of a b-spline basis this size:


```{r echo = F}
bs_mat <- bs(x = seq(0,1,length = 100), df = 5, intercept = T)
random_coefs <- matrix(runif(5*20, -1, 1), ncol = 5)
random_fits <- random_coefs %*% t(bs_mat)
data.frame(t(random_fits)) %>%
  mutate(time = seq(0,1,length = 100)) %>%
  gather(fit, value, -time) %>%
  ggplot(aes(time, value))+
    geom_line(color = "red")+
    theme_minimal()+
    ggtitle("lots of different bspline fits")+
    transition_states(fit,
                    transition_length = 2,
                    state_length = 1)+
    ease_aes("cubic-in-out")->anim_bs_many
  
#anim_save(filename = "figures/many_bs.gif", 
#          animation = anim_bs_many)
```

<div style = "width:85%;margin-right:auto;margin-left:auto;">
![](figures/many_bs.gif)
</div>

And it's also important to also remember:

<div class = "emphasize">Bigger Basis = More Wiggly</div>

```{r echo = F}
spline_examples <- runif(20, 0, 1)
df_range <- 5:20

fit_spline <- function(x, coefs){
  basis <- bs(seq(0,1, length = 100), df = x, intercept = T)
  fit <- (coefs[1:x] %*% t(basis))[1,]
  tibble(time = seq(0,1, length = 100),
         fit = fit,
         nspline = x)->out
  return(out)
}

map_df(df_range, fit_spline, coefs = spline_examples) %>%
  ggplot(aes(time, fit))+
    geom_line() +
    theme_minimal()+
    ggtitle("nbasis = {closest_state}")+
    transition_states(nspline)+
    ease_aes("cubic-in-out") -> wiggly_anim

#anim_save(filename = "figures/wiggly_bs.gif", 
#          animation = wiggly_anim)
```

<div style = "width:85%;margin-right:auto;margin-left:auto;">
![](figures/wiggly_bs.gif)
</div>

### Knots

Now is the time to introduce a new notion: **knots**.
You'll most often see people talking about splines in terms of "knots" or "control points". 
If you've everplayed around with making vector graphics in something like Photoshop, Illustrator, or Inkscape, you'll be kind of familiar with with the notion of "knots". 
Here's a figure of a line I drew in Inkscape that has one knot.

<div style = "width:60%;margin-right:auto;margin-left:auto;">
![](figures/knots.png)
</div>

The relationship between the number of "knots" and the number of basis functions is a little bit complicated[^2], but *usually* the more knots, the more basis polynomials there are, so we can now further revise our saying about wiggliness.

<div class = "emphasize">More Knots = More Wiggly</div>

The basis we were looking at above actually had only one knot, but even with that we can fit the curve of the that single formant track really well.

```{r echo = F}
bs_df <- data.frame(bs(x = 1:nrow(ay_single), df = 5, intercept = T)) %>%
          mutate(t_prop = ay_single$t_prop)


ay_single %>%
  ungroup()%>%
  select(t_prop, F1)%>%
  left_join(bs_df)%>%
  select(-t_prop)%>%
  lm(F1 ~ .-1, data  = .) -> bs_mod
```




<div style = "width:85%;margin-right:auto;margin-left:auto;">
```{r fig.width=2.5, fig.height = 1.5, echo = F, out.width = "100%", message = F, warning= F}
ay_single %>%
  ungroup()%>%
  mutate(pred = predict(bs_mod))%>%
  select(t_prop, F1, pred)%>%
  gather(source, value, -t_prop) %>%
  ggplot(aes(t_prop, value, color = source, size=source, alpha = source))+
    geom_line()+
    scale_color_manual(values = c("black", "red"))+
    scale_size_manual(values = c(1,3))+
    scale_alpha_manual(values = c(1, 0.4))+
    theme_minimal()
```
</div>

### Interpretation... not so straightforward

The downside to splines is that the coefficients we get for each polynomial aren't *really* interpretable. Here are the 5 coefficients that are being used to fit the curve to the formant trajectory.

```{r echo = f}
coef(bs_mod)
```

These don't really *mean* anything like the fourier transform coefficients did.
Interpreting the results of fitting a model with splines will take a little bit more work.
For the most part, it will involve making visualizations of the data.ah








# The Math Basics

Here are two questions that are pretty reasionable right now:

1. Ok, how do i *get* these curves?
2. Isn't multiplying and adding all these curves together a lot of work?

The answers are "the computer gives them to us" and "surprisingly no!"

For the most part, we're not going to be constructing spline bases "by hand" (see the "Doing it in R" section below).
But even then it's not that tricky. 
The `splines` package will let us construct a b-spline basis as easilly as this:

```{r}
library(splines)
bspline_basis <- bs(1:10, df = 5, intercept = T)
bspline_basis
```

<div style = "width:85%;margin-right:auto;margin-left:auto;">
```{r fig.width=2.5, fig.height = 1.5, echo = F, out.width = "100%", message = F, warning= F}
matplot(1:10, bspline_basis, type = 'l')
```
</div>

Now lets say we choose the following 5 coefficients

```{r message=FALSE, warning=FALSE, paged.print=FALSE}
coefficients <- c(3, 6, 2, 5, 1)
```

Scaling each polynomial and summing up across them is as easy as:

```{r}
coefficients %*% t(bspline_basis)
```

The `%*%` syntax is doing matrix multiplication.
Usually you can get away without thinking about matrix multiplicatio or linear algebra when doing stats, but since it underlies so many quantitative and computational methods, it's handy to know a little bit about it.

This animation illustrates the mathematical operations that are going on when you do the matrix multiplication written out above.

![](figures/foo.gif)


# Doing it in R

To fit a spline to data in R, you'll want to use the `gam()` function, which stands for "generalized additive model" from the `mgcv` package. 
Let's remind ourselves of the [ay] trajectory data.

```{r echo = F}
ay_trajectory <- ay_trajectory %>%
  ungroup()%>%
  mutate(gender = factor(gender))
```


<div style = "width:85%;margin-right:auto;margin-left:auto;">
```{r fig.width=2.5, fig.height = 1.5, echo = F, out.width = "100%"}
ay0_df %>%
  filter(dob %in% c(1964, 1965))%>%
  group_by(idstring, dob, gender, n)%>%
  summarise(F1 = mean(F1),
            t_prop = mean(t_prop)) -> ay_trajectory

ay_trajectory %>%
  ggplot(aes(t_prop, F1))+
    geom_line(aes(group = idstring))+
    #stat_smooth(method = "lm")+
    facet_wrap(~gender) + 
    theme_bw()+
    ggtitle("dob = 1964, 1965, linear model")
```
</div>

Let's fit a generalized additive model to this data, with the following specifications.

- We want to enter gender into the model like we would for any linear model.
- We want to fit a spline over `t_prop` with, let's say, 10 knots.
- We want to fit separate splines for each gender.

Here's the code:

```{r}
library(mgcv)
ay_model <- gam(F1 ~ gender + s(t_prop, by = gender, k = 10), 
                data = ay_trajectory)
```

Now, `gam` is doing a lot more fancy stuff than just finding the best coefficients for each basis polynomial. 
For one, it's automatically identifying and utilizing a "smoothing parameter."
As the smoothing parameter increases, it'll penalize the curve down to a straight line.
What's nice about this is that the resulting smooth won't (usually) be unduely wiggly, even if you have many many knots.

If we look at a summary of `ay_model`, there's some useful information, but it's less interpretable than a normal linear regression.

```{r}
summary(ay_model)
```

The significant gender effect tells us that, unsurprisingly, women had a higher F1 than men by about 127 Hz.
The significance of the smooth terms isn't all that meaningful: it means the smooths were significantly different than a straight flat line.
The column called `edf` (stands for Estimated Degrees of Freedom) is useful to eyeball.
If the smooth was penalized down to a straight line, it would have an an EDF of about 1.
These EDFs are larger than 1, and women's are larger than men's (although, we'd need to do a lot more work to see if it was *significantly* larger), so it looks like it was worth fitting a fanicer model than just a linear model!

```{r echo = F}
library(itsadug)
```
```{r echo = F}
ay_pred <- get_predictions(ay_model, cond = list(t_prop = seq(0,1,length = 15),
                                      gender = c("men", "women")),
                           print.summary = F)
```

To really explore gam fits, we have to plot them. 
I'm going to spare you the code for making these plots right now, but you can scrutinize it by downloading the code from the little drop down button at the top of the page.

<div style = "width:85%;margin-right:auto;margin-left:auto;">
```{r fig.width=2.5, fig.height = 1.5, echo = F, out.width = "100%"}
ay_pred %>%
  ggplot(aes(t_prop, fit))+
    geom_ribbon(aes(ymin = fit-(1.96 * CI), 
                    ymax = fit + (1.96 * CI), 
                    fill = gender),
                alpha = 0.4)+
    geom_line(aes(color = gender))+
    scale_color_colorblind()+
    scale_fill_colorblind()+
    theme_minimal()
```
</div>

But here's an important warning!

<div class = "emphasize">Don't just look to see if the confidence intervals of the smooths are overlapping or not!</div>

You can't really really tell if two curves are significantly different from just looking at whether their confidence intervals overlap, cause that's not really how confidence intervals work.
Instead, we need to look at a difference curve, which we *can* get from a gam model.

<div style = "width:85%;margin-right:auto;margin-left:auto;">
```{r fig.width=2.5, fig.height = 1.5, echo = F, out.width = "100%"}
get_difference(ay_model, 
               comp = list(gender = c("women", "men")),
               cond = list(t_prop = seq(0,1, length = 15)),
               print.summary = F) -> difference_curve

difference_curve %>%
  ggplot(aes(t_prop, difference))+
    geom_ribbon(aes(ymin = difference-(1.96 * CI), 
                    ymax = difference + (1.96 * CI)),
                alpha = 0.4)+
    geom_line()+
    geom_hline(yintercept = 0, linetype = 2)+
    theme_minimal()+
    ggtitle("Estimated women - men difference")
```

</div>


# Recap, and Further Issues

## Recap

1. Splines are fit using a "basis" of polynomial curves which are scaled and summed to approximate a curve.
2. One of the strengths of using non-linear methods is that you can model relationships that are clearly non linear. Some weaknesses is that it is more challenging to intepret the results, and we need to be careful of over-fitting.
3. You can fit a spline using generalized additive models, specifically as implemented in the `mgcv` package.

## Things you'll want gams to do, and they can

### Random Intercepts

Each speaker has their own personal average F1 value for their [ay] that won't be perfectly captured by their demographic features. 
For this, you'll want to fit random intercepts by speaker, *and you can*.

### Random *smooths*

Each speaker will also have their own personal F1 *trajectories*, and to capture this, you'll want to fit random smooths (like random slopes), *and you can*.

### Autocorrelation

Importantly, the data we're modelling aren't just a random association of data points between time stamps and F1 values, like this plot shows.
<div style = "width:85%;margin-right:auto;margin-left:auto;">
```{r fig.width=2.5, fig.height = 1.5, echo = F, out.width = "100%"}
ay0_df %>%
  filter(dob %in% c(1964, 1965))%>%
  group_by(idstring, dob, gender, n)%>%
  summarise(F1 = mean(F1),
            t_prop = mean(t_prop)) -> ay_trajectory

ay_trajectory %>%
  ggplot(aes(t_prop, F1))+
    geom_point(aes(group = idstring))+
    #stat_smooth(method = "lm")+
    theme_bw()+
    ggtitle("dob = 1964, 1965, linear model")
```

</div>

Each F1 value is directly connected to an F1 value at the next time point.
That is, they are auto-correlated.
<div style = "width:85%;margin-right:auto;margin-left:auto;">
```{r fig.width=2.5, fig.height = 1.5, echo = F, out.width = "100%"}
ay0_df %>%
  filter(dob %in% c(1964, 1965))%>%
  group_by(idstring, dob, gender, n)%>%
  summarise(F1 = mean(F1),
            t_prop = mean(t_prop)) -> ay_trajectory

ay_trajectory %>%
  ggplot(aes(t_prop, F1))+
    geom_point()+
    geom_line(aes(group = idstring))+
    #stat_smooth(method = "lm")+
    theme_bw()+
    ggtitle("dob = 1964, 1965, linear model")
```

</div>
You can capture this with gams also!

### Multidimensional Smooths

The most fun thing to do with gams is to fit multidimensional smooths. 
For example, how does the trajectory of F1 change over dates of birth?
I've included the code for that model here, just so you can see how simple an extension it is of what we've already done.

```{r eval = F}
big_ay_model <- gam(F1 ~ te(t_prop, dob, k = 10)+
                       s(idstring, bs = 're'), 
                     data = ay0_m)
```


```{r echo = F}
ay_pred <- read_rds("data/ay_pred.rds")
```

```{r echo = F}
ay_pred %>%
  ggplot(aes(t_prop, fit))+
    geom_line()+
    transition_states(dob)+
    theme_minimal()+
    ease_aes("cubic-in-out")+
    ggtitle("Date of Birth {closest_state}") -> anim_2d
#anim_save(filename = "anim_2d.gif", animation = anim_2d)
```

<div style = "width:85%;margin-right:auto;margin-left:auto;">
![](figures/anim_2d.gif)
</div>


[^1]: It has low variance, but high bias.
[^2]: The number of basis functions is = knots + degree - 2. The "degree" of the basis functions controls how the curves connect smoothly at the knots. A degree of 3 makes the *rate of change* smooth, and a degree of 4 also makes the *acceleration* smooth. A degree of 3 or 4 is usually what you'll want to be working with.
